HTTP Server workflow

An HTTP Server workflow has one or more processes that always start with the HTTP Server Input task and return something to a client, typically a web browser. Each process is identified by a specific name, referred to as its "action", which is called from the browser itself.

HTTP Server Input tasks are typically used in one of the two following situations:

  • HTML Form Action: An HTML Form in the browser that may contain text and attached files can be filled and sent to a process with the HTTP Server Input task.
  • HTTP Data Submission: A custom application or a server sends the request to PReS Workflow using either a POST or GET command. The application or server then waits for a response from PReS Workflow.
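
As an illustration of the HTTP Data Submission case, here is a minimal Python sketch of a client that prepares a GET or POST request and then waits for the response. The host and port (127.0.0.1:8080), the action name myaction, the field names and the client-side timeout are all assumptions chosen for the example; they are not values required by PReS Workflow.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed address of the Workflow HTTP service; adjust to your own setup.
BASE_URL = "http://127.0.0.1:8080"


def build_get(action: str, fields: dict) -> Request:
    """Prepare a GET request that carries the fields in the query string."""
    return Request(f"{BASE_URL}/{action}?{urlencode(fields)}", method="GET")


def build_post(action: str, fields: dict) -> Request:
    """Prepare a POST request that carries the fields in a URL-encoded body."""
    body = urlencode(fields).encode("utf-8")
    return Request(f"{BASE_URL}/{action}", data=body, method="POST")


def submit(request: Request, timeout_s: float = 130.0) -> bytes:
    """Send the request and block until a response arrives.

    The timeout is an arbitrary choice for this sketch.
    """
    with urlopen(request, timeout=timeout_s) as response:
        return response.read()
```

Calling submit(build_get("myaction", {"id": "12345"})) would send the request and return whatever the process hands back, provided a process with that action exists on the server.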

PReS Workflow can serve both static and dynamic resources to a web browser. However, it is not meant to be used as a fully featured web server, as it is built for neither responsiveness nor guaranteed uptime. It is much better to have a common web server (for example, IIS or Apache) serve your content and to use PReS Workflow only for the processing that only it can do. For more information on how to serve HTML and PDF generated by Connect through IIS, watch the Connect with Evie - IIS series.

Tip: The NodeJS Server Input task does essentially the same as the HTTP Server Input task, but it uses a NodeJS Server (installed by Workflow) instead of Workflow's custom server component. The NodeJS Server Input task is more secure, more up to date and more standardized.
The NodeJS Server is configured using its three settings dialogs in the Preferences (Workflow button > Preferences).

Note: You can control access to the PReS Workflow Tools HTTP Server via the Access Manager.

Important configuration, setup and options

Before starting to work with HTTP workflows, there are a few key points to keep in mind in terms of configuration.

First of all, the following options are available in the PReS Workflow Preferences, under the HTTP Server Input 1 and HTTP Server Input 2 sections:

  • Port (default value: 8080, recommended): The port on which PReS Workflow listens, and to which a browser must address its requests. Most web servers use port 80 by default, in which case the port does not need to be included in the address. For example, typing http://www.objectiflune.com/ in a browser actually accesses http://www.objectiflune.com:80/ , but the browser hides port 80. Port 8080 is used by default to prevent any interference with existing web servers installed or activated on the same server as PReS Workflow.
  • Time-out (seconds): This determines how long the HTTP Server service waits for the process to finish before returning a timeout error to the client browser. This means that if a process takes more than 120 seconds (the default) to complete, the browser will time out. While you can change this value, it is recommended to keep processing time to a minimum, since both browsers and users tolerate waits of more than a minute poorly, even when they are aware of the processing time.
  • Enable server for SSL requests: This enables secure communication between the browser and the server via HTTPS. When you enable this option, you need to provide the proper certificate, key and password (see also: Obtaining a certificate). While this configuration is beyond the scope of this documentation, there are plenty of resources on the Internet that explain these systems.
  • Serve HTTP resources: This is where you enable static resources in PReS Workflow. When this option is enabled, the HTTP server treats the Resource action name as a folder and looks in the Resource Folder for any file requested under it. This means that, if your Resource folder is c:\PReS\http and your Resource action name is static, pointing your browser to http://127.0.0.1:8080/static/css/style.css will immediately load and return the file c:\PReS\http\css\style.css . This does not require any process to work: everything is handled directly by the HTTP Server Input and files are returned immediately. This feature is very useful when dealing with stylesheets, images, browser JavaScript, or static HTML files that do not require any processing.

Note: It is possible to serve a default HTML file when no action is specified, for example http://localhost:8080/ . This is done by creating an index.html file in the Resource Folder defined above. However, resources called by this index.html must still use the Resource action name, for example a stylesheet would still point to http://127.0.0.1:8080/static/css/style.css or more simply static/css/style.css.
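
The mapping between a requested address and a file in the Resource Folder can be sketched in Python as follows. This is only an illustration of the rule described above, not the server's actual implementation: the function name resolve_static is hypothetical, and details such as rejecting ".." segments are left out.

```python
from pathlib import PureWindowsPath
from typing import Optional
from urllib.parse import urlsplit


def resolve_static(url: str, resource_folder: str, action_name: str) -> Optional[str]:
    """Return the file a static request maps to, or None if it is not static.

    Hypothetical helper that mirrors the documented rule only.
    """
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if not segments:
        # No action specified: the default index.html is served.
        return str(PureWindowsPath(resource_folder) / "index.html")
    if segments[0] != action_name:
        # Not the Resource action name, so the request belongs to a process.
        return None
    # Everything after the action name is a path inside the Resource Folder.
    return str(PureWindowsPath(resource_folder).joinpath(*segments[1:]))
```

With the values used above, resolve_static("http://127.0.0.1:8080/static/css/style.css", r"c:\PReS\http", "static") yields c:\PReS\http\css\style.css.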

You also need to take into consideration the options inside each of your processes that start with the HTTP Server Input task, as they will greatly impact how this process responds. In the process's properties, the following options will modify HTTP behavior:

  • Self-Replicating Process: This option is critically important when dealing with HTTP processes. Basically, this means that when HTTP requests are received, the process will duplicate itself up to the specified maximum number, in order to simultaneously (and asynchronously) handle multiple requests. See Process properties for more details.
  • As soon as possible: This option needs to be checked, otherwise requests will not be handled as they come in (this option is meant to be used on scheduled processes that run at intervals).
  • Polling Interval (sec): This option determines how much time the HTTP Server Input waits between the moment it finishes processing a request and the moment it picks up a new request. This should be put at 0 in order to process requests as soon as possible, meaning immediately.

Finally, consider the HTTP Server Input task properties. While these are described in the HTTP Server Input task properties page, here are a few considerations to keep in mind when using this task:

  • The HTTP Action corresponds precisely to the name immediately following the first slash of your address. That is to say, placing the action myaction here means the process would be triggered by opening http://127.0.0.1:8080/myaction in your browser.
  • The HTTP service accepts both POST and GET requests. Other than the presence of file attachments, there is little difference in how these are handled. This means that visiting /myaction?id=12345&q=test would be the same as having a form with two <input> fields named, respectively, id and q, and submitting them with the information "12345" and "test". In both cases, this information is located in the XML envelope that is the original input file of a process that starts with a Server Input task.
  • When doing POST requests and uploading files, always make sure to include the "multipart" option in the <form> tag:
    <form action="http://127.0.0.1:8080/myaction" method="POST" enctype="multipart/form-data">
    Otherwise, file attachments will not be received, only their file names.
  • The Mime Type option is better left at Auto-Detect unless the process requires it to be forced to a specific type. This means that if a process can either return a PDF when successful or an HTML page with an error message, it will not attempt to send an HTML page with a PDF MIME type (which would cause confusion).
  • There is no HTTP Server Output task (see below on how to end your process).
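
To make the first two points concrete, the following Python sketch shows how an address breaks down into an action name and a set of fields, and that a URL-encoded form body carries the same fields as a query string. The helper names split_request and form_fields are hypothetical, and the sketch does not reproduce the XML envelope that PReS Workflow actually builds.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qsl, urlsplit


def split_request(url: str) -> Tuple[str, Dict[str, str]]:
    """Return (action, fields) for an address such as /myaction?id=12345."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # The action is the name immediately following the first slash.
    action = segments[0] if segments else ""
    fields = dict(parse_qsl(parts.query))
    return action, fields


def form_fields(body: str) -> Dict[str, str]:
    """Decode a URL-encoded form body (a POST without file attachments)."""
    return dict(parse_qsl(body))
```

Both split_request("http://127.0.0.1:8080/myaction?id=12345&q=test") and a form posting id=12345&q=test to /myaction end up with the same two fields, id and q.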

Request/process/response cycle

Once a process using the HTTP Server Input task is created, it is important to understand the cycle that is triggered when such a process runs. Note that this is the cycle when the default HTTP Server Input task options are used (more on how that behavior changes later):

  1. A request is received by the HTTP service.
  2. This request is converted into an XML request file along with one or more attachments when present.
  3. The XML request file and attachments are saved in a local folder, if the HTTP Action is a valid one (otherwise, the files are deleted).
  4. The HTTP service keeps the request from the client open (it does not yet respond to it), and waits.
  5. The HTTP process corresponding to the HTTP Action captures the XML file and attachments and the process begins.
  6. The process runs its course just like any other process would (including subprocesses, send to process, etc).
  7. The very last job file that is active when the process finishes is then returned to the HTTP service.
  8. The HTTP service returns the file to the client and then closes the connection.
  9. If, during this time, the timeout has expired (if the process takes more than 120 seconds), the HTTP service returns a "timeout" to the client, but the process still finishes on its own. When the process finishes, the return file is ignored by the HTTP service.
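
The cycle above can be simulated in a few lines of Python. This is a rough model under stated assumptions, not how the HTTP service is implemented: a process is represented by a plain function that takes the XML request and returns the last job file as text, the serve function and its parameters are hypothetical, and the value returned for an invalid action is a placeholder, since what the client receives in that case is not described here.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable, Dict, Optional

# Hypothetical stand-in for a Workflow process: XML request in, last job file out.
Process = Callable[[str], str]


def serve(action: str, request_xml: str, processes: Dict[str, Process],
          timeout_s: float, pool: ThreadPoolExecutor) -> Optional[str]:
    """Model steps 3 to 9 of the cycle for a single request."""
    process = processes.get(action)
    if process is None:
        # Step 3: the action is not valid, so the request files are discarded.
        # Placeholder return value (assumption for this sketch).
        return None
    # Steps 4 to 6: the connection stays open while the process runs.
    future = pool.submit(process, request_xml)
    try:
        # Steps 7 and 8: the last job file is returned to the client.
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # Step 9: the client gets a timeout; the process keeps running in the
        # pool and its eventual result is ignored.
        return "timeout"
```

Using a worker pool keeps the waiting side separate from the processing side, which is what allows the simulated process to finish on its own after the client has already received the timeout.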

Point 7 is critical to understand, as it has an impact on what the client receives. If a process receives a file that is split into multiple parts and each of these parts generates an output, the last split's output will be sent to the client. If the last output task generates a PostScript file for printing, this PostScript is returned to the client.

In most cases, what is returned is what remains after the last task, but only if this task's processing is done in PReS Workflow. For example, if the data file is a text file and this file is sent to PReS Image using the Image connector, it is the text file that is returned, not the output of PReS Image. Similarly, ending a process with the Delete task does not return an empty file; it returns the actual data file.

The most common way of returning a response is to generate an HTML file using either Create File or Load External File, then use the Delete task as the last output. The HTML is thus returned to the client.

Example HTTP Workflows